The Difficulty of Reduced Error Pruning of Leveled Branching Programs

Author

  • Tapio Elomaa
Abstract

Induction of decision trees is one of the most successful approaches to supervised machine learning. Branching programs are a generalization of decision trees and, by the boosting analysis, exponentially more efficiently learnable than decision trees. In experiments this advantage has not been seen to materialize. Decision trees are easy to simplify using pruning. For branching programs no such algorithms are known. We prove that reduced error pruning of branching programs is infeasible. Finding the optimal pruning of a branching program with respect to a set of pruning examples that is separate from the set of training examples is NP-complete. Therefore, we are forced to consider approximate solutions to this problem. We also prove that finding an approximate solution of arbitrary accuracy is computationally intractable. In particular, the reduced error pruning of branching programs is APX-hard.
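For contrast with the hardness result for branching programs, the decision-tree case the abstract alludes to is straightforward: reduced error pruning walks the tree bottom-up and replaces a subtree with a leaf whenever that does not increase error on a held-out pruning set. The sketch below is illustrative only; the `Node`, `classify`, and `rep` names and the binary-feature encoding are assumptions, not the paper's formulation.

```python
# A minimal sketch of reduced error pruning (REP) for a binary decision tree.
# Node, classify, and rep are illustrative names, not from the paper.

class Node:
    def __init__(self, feature=None, left=None, right=None, label=None):
        self.feature = feature   # feature index tested at an internal node
        self.left = left         # subtree followed when the feature is 0
        self.right = right       # subtree followed when the feature is 1
        self.label = label       # class label (set only at a leaf)

def classify(node, x):
    while node.label is None:
        node = node.right if x[node.feature] else node.left
    return node.label

def errors(node, examples):
    return sum(1 for x, y in examples if classify(node, x) != y)

def rep(node, pruning_set):
    """Bottom-up REP against a pruning set kept separate from training data."""
    if node.label is not None or not pruning_set:
        return node
    left_set = [(x, y) for x, y in pruning_set if not x[node.feature]]
    right_set = [(x, y) for x, y in pruning_set if x[node.feature]]
    node.left = rep(node.left, left_set)
    node.right = rep(node.right, right_set)
    # Candidate leaf: majority class of the pruning examples reaching this node.
    labels = [y for _, y in pruning_set]
    leaf = Node(label=max(set(labels), key=labels.count))
    # Keep the leaf if it does at least as well here on the pruning set.
    return leaf if errors(leaf, pruning_set) <= errors(node, pruning_set) else node
```

Because each subtree is visited once and the pruning examples are partitioned down the tree, this runs in time linear in the tree size per pruning example; the paper's point is that no such efficient procedure can exist for branching programs unless P = NP.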


Related Articles

Learning Small Trees and Graphs that Generalize

In this Thesis we study issues related to learning small tree and graph formed classifiers. First, we study reduced error pruning of decision trees and branching programs. We analyze the behavior of a reduced error pruning algorithm for decision trees under various probabilistic assumptions on the pruning data. As a result we get, e.g., new upper bounds for the probability of replacing a tree t...


An Analysis of Reduced Error Pruning

Top-down induction of decision trees has been observed to suffer from the inadequate functioning of the pruning phase. In particular, it is known that the size of the resulting tree grows linearly with the sample size, even though the accuracy of the tree does not improve. Reduced Error Pruning is an algorithm that has been used as a representative technique in attempts to explain the problems o...


Reduced Error Pruning of branching programs cannot be approximated to within a logarithmic factor

In this paper, we prove under a plausible complexity hypothesis that Reduced Error Pruning of branching programs is hard to approximate within log^(1−δ) n, for every δ > 0, where n is the number of description variables, a measure of the problem's complexity. The result holds under the assumption that NP problems do not admit deterministic, slightly superpolynomial time algorithms: NP ⊄ TIME(|I|^O...


Pruning Decision Trees with Misclassification Costs (appears in ECML-98 as a research note)

We describe an experimental study of pruning methods for decision tree classifiers when the goal is minimizing loss rather than error. In addition to two common methods for error minimization, CART's cost-complexity pruning and C4.5's error-based pruning, we study the extension of cost-complexity pruning to loss and one pruning variant based on the Laplace correction. We perform an empirical com...


Pruning Decision Trees with Misclassification Costs 08-FEB-1998

We describe an experimental study of pruning methods for decision tree classifiers in two learning situations: minimizing loss and probability estimation. In addition to the two most common methods for error minimization, CART's cost-complexity pruning and C4.5's error-based pruning, we study the extension of cost-complexity pruning to loss and two pruning variants based on Laplace corrections. W...




Journal:

Volume   Issue

Pages  -

Publication date: 2002